Generalizing Syntactic Collocates for Creative Language Generation
Abstract
This paper presents the construction of a data source that supports the automatic generation of cryptic crossword clues in a system called ENIGMA. Cryptic crossword clues have two layers of meaning: a surface reading that appears to be a fragment of English prose, and a puzzle reading that the solver must uncover to solve the clue. The content expressed by the clue, which is also the input to the generation process, is a wordplay puzzle such as an anagram. In expressing this puzzle, ENIGMA must choose language creatively so that a separate surface reading of the text is also generated, in effect translating a semantic input via a layered text to a new semantic output. To ensure that this surface text is meaningful, ENIGMA uses corpus data to determine which words can be combined meaningfully and which cannot.
Similar resources
Rule-Based Extraction of English Verb Collocates from a Dependency-Parsed Corpus
We report on a rule-based procedure of extracting and labeling English verb collocates from a dependency-parsed corpus. Instead of relying on the syntactic labels provided by the parser, we use a simple topological sequence that we fill with the extracted collocates in a prescribed order. A more accurate syntactic labeling will be obtained from the topological fields by comparison of correspond...
Comparison of right hemisphere damage patients and normal adults in some linguistic performances
Introduction: According to some evidence, damage to the right hemisphere leads to impaired linguistic and cognitive functions. Patients with right hemisphere damage (RHD) experience difficulties at different levels of language. Assessing and diagnosing language disorders in RHD patients help to plan treatment programs. Therefore, the present study investigated some of the language functions in ...
Syntactic realization with data-driven neural tree grammars
A key component in surface realization in natural language generation is to choose concrete syntactic relationships to express a target meaning. We develop a new method for syntactic choice based on learning a stochastic tree grammar in a neural architecture. This framework can exploit state-of-the-art methods for modeling word sequences and generalizing across vocabulary. We also induce embedd...
Building a Collocational Semantic Lexicon
Natural Language Generation (NLG) systems require access to collocational information to help determine lexical choices constrained both by syntactic and semantic concerns. Constructing linguistic resources to support these decisions can be time-consuming whereas, if the information is extracted automatically, data sparsity limits the variety of the output. This paper reports on a method for ex...
Generation of Word Profiles on the basis of a large and balanced German corpus
Electronic corpora have been used in lexicography and the domain of language learning for more than two decades (cf. Braun et al. 2006, Sinclair 1991). Traditionally, computer platforms exploiting these corpora were based on concordances that present a word in its different contexts. However, concordances hit their limits for very large corpora where the result sets are generally too large for ...
Publication date: 2007